A modular tool to aggregate results from bioinformatics analyses across many samples into a single report.
Summary Tables
BAM Header Info
Basic metadata extracted from the BAM header.
| Sample Name | Version | Sort Order | Platform | Genome Assembly Identifier | Tools |
|---|---|---|---|---|---|
| HG00096 | 1.0 | coordinate | ILLUMINA | N/A | GenomeAnalysisTK, bam_calculate_bq, bam_count_covariates, bam_mark_duplicates, bam_merge, bam_merge.1, bam_realignment_around_known_indels, bam_recalibrate_quality_scores, bwa, bwa_aln_fastq, bwa_index, bwa_sam, gatk_target_interval_creator, picard, sam_to_fixed_bam, samtools |
Genome Results Summary
General information from the BAM file.
| Sample Name | File name | Number of reads | Number of mapped reads | Number of mapped paired reads (both in pair) | Number of mapped paired reads (singletons) | Median insert size | Mean mapping quality | % GC | General error rate | Mean coverage | Std coverage | % Duplicated reads | % Chimeras | PF Q30 bases | Total bases |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HG00096 | HG00096.chrom11.ILLUMINA.bwa.GBR.low_coverage.20120522.bam | 6412020 | 6396581 | 6358321 | 15439 | 182 | 2.1 | 40.9 | 0.0 | 0.2 | 1.1 | 1.4 | 0.2 | 567431842 | 641202000 |
Median Coverage Across Reference
This plot displays sequencing coverage across chromosomes, summarized as the median coverage within non-overlapping 3 Mb windows. It helps identify regions with unusually high or low coverage, which may reflect technical biases, duplications, or low-complexity regions.
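The windowing described above can be sketched roughly as follows. This is a minimal illustration, not the code that produced this report: the function name `windowed_median_coverage` and the toy depth values are assumptions, and the example uses a 4-base window instead of 3 Mb so the result is easy to check by hand.

```python
from statistics import median

def windowed_median_coverage(depths, window_size=3_000_000):
    """Return the median depth of each non-overlapping window.

    depths: per-base coverage values for one chromosome, in position order.
    The final window may be shorter than window_size.
    """
    return [
        median(depths[start:start + window_size])
        for start in range(0, len(depths), window_size)
    ]

# Toy example: hypothetical depths, window of 4 bases instead of 3 Mb.
toy_depths = [1, 1, 2, 2, 10, 10, 10, 12, 0, 0]
print(windowed_median_coverage(toy_depths, window_size=4))
```

The median is less sensitive than the mean to a few extreme positions, so a window stands out only when most of its bases are over- or under-covered.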
Mapping Quality Distribution
This plot shows the number of genomic positions for each mapping quality value. Mapping quality reflects the confidence of read alignment, typically ranging from 0 to 60.
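For orientation, mapping quality in the SAM format is Phred-scaled: MAPQ = -10 log10(P), where P is the probability that the mapping position is wrong. The short sketch below converts a MAPQ value back to that probability; the function name is an assumption chosen for illustration.

```python
def mapq_to_error_probability(mapq):
    """Convert a Phred-scaled mapping quality to the probability
    that the reported alignment position is wrong."""
    return 10 ** (-mapq / 10)

# A few values across the typical 0 to 60 range.
for q in (0, 20, 60):
    print(q, mapq_to_error_probability(q))
```

A MAPQ of 0 therefore carries no confidence in the placement, while 60 corresponds to roughly one wrong placement in a million.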
Insert Size Distribution
This histogram displays the distribution of insert sizes (the distance spanned on the reference by a read pair). A typical insert size distribution shows a single smooth peak.
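A histogram like this can be approximated from the template length (TLEN) field of the alignments. The sketch below is a simplified illustration under stated assumptions: the function name, the 10 bp bin width, and the toy TLEN values are hypothetical, and only positive TLEN values are kept so that each pair is counted once.

```python
from collections import Counter
from statistics import median

def insert_size_histogram(tlens, bin_width=10):
    """Bin insert sizes and return (histogram, median insert size).

    tlens: TLEN values from paired reads. Mates of a pair carry the same
    magnitude with opposite signs, so only positive values are used.
    Raises statistics.StatisticsError if no positive TLEN is present.
    """
    sizes = [t for t in tlens if t > 0]
    hist = Counter((s // bin_width) * bin_width for s in sizes)
    return dict(sorted(hist.items())), median(sizes)

# Hypothetical TLEN values; 0 marks a read without pair information.
toy_tlens = [180, -180, 175, -175, 190, -190, 0, 182, -182]
hist, med = insert_size_histogram(toy_tlens)
print(hist, med)
```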
Mapped Reads Clipping Profile
Shows the percentage of clipped bases at each read position. Clipping (hard or soft) is inferred from the CIGAR string in the BAM file and may indicate sequencing or alignment artifacts.
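To make the CIGAR-based inference concrete, the sketch below derives a per-position soft-clipping profile from a list of CIGAR strings. It is a simplified illustration, not the report's implementation: the function names and toy CIGARs are assumptions, hard clips are ignored because hard-clipped bases are absent from the stored read sequence, and the percentage assumes all reads have the same length.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def soft_clip_positions(cigar):
    """Return (clipped read positions, read length) for one CIGAR string."""
    pos = 0
    clipped = []
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op == "S":
            clipped.extend(range(pos, pos + n))
        if op in "MIS=X":  # operations that consume read bases
            pos += n
    return clipped, pos

def clipping_profile(cigars):
    """Percentage of reads that are soft-clipped at each read position."""
    counts = Counter()
    max_len = 0
    for cigar in cigars:
        positions, read_len = soft_clip_positions(cigar)
        counts.update(positions)
        max_len = max(max_len, read_len)
    return [100.0 * counts[i] / len(cigars) for i in range(max_len)]

# Four hypothetical 10-base reads.
print(clipping_profile(["2S8M", "10M", "1S7M2S", "10M"]))
```

Clipping concentrated at read ends, as in the toy data, is the usual pattern; clipping in the middle of reads cannot occur, because the S and H operations are only valid at the ends of a CIGAR string.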
Mapped Reads Nucleotide Content
Displays the proportion of each nucleotide (A, C, G, T) at each read position. Unbalanced content could indicate biases in sequencing or library preparation.
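The per-position proportions can be sketched as below. The function name and the toy reads are assumptions for illustration; bases other than A, C, G, and T (such as N) are excluded from the denominator in this sketch, which may differ from how the report computes them.

```python
from collections import Counter

def nucleotide_content(reads):
    """Return, for each read position, the proportion of A, C, G and T."""
    length = max((len(r) for r in reads), default=0)
    per_position = [Counter() for _ in range(length)]
    for read in reads:
        for i, base in enumerate(read.upper()):
            per_position[i][base] += 1
    content = []
    for counts in per_position:
        total = sum(counts[b] for b in "ACGT")
        content.append({b: (counts[b] / total if total else 0.0) for b in "ACGT"})
    return content

# Four hypothetical 4-base reads.
for position, proportions in enumerate(nucleotide_content(["ACGT", "ACGA", "TCGA", "ACGA"])):
    print(position, proportions)
```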
Homopolymer Indels Distribution
This bar plot shows the number of insertions or deletions (indels) found within homopolymer regions (e.g., stretches of AAAAA) and non-homopolymer regions. High numbers may suggest sequencing issues.
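One way to separate homopolymer from non-homopolymer indels is to locate runs of identical bases in the reference and check whether an indel position falls inside one. The sketch below illustrates that idea only; the function names, the minimum run length of 5, and the toy sequence are assumptions, and the tool behind this report may use a different definition.

```python
from itertools import groupby

def homopolymer_runs(seq, min_len=5):
    """Return (start, end, base) for each run of identical bases
    of at least min_len; end is exclusive."""
    runs = []
    pos = 0
    for base, group in groupby(seq):
        n = len(list(group))
        if n >= min_len:
            runs.append((pos, pos + n, base))
        pos += n
    return runs

def in_homopolymer(indel_pos, runs):
    """True if a 0-based indel position lies inside any run."""
    return any(start <= indel_pos < end for start, end, _ in runs)

# Hypothetical reference fragment with two homopolymer stretches.
reference = "ACG" + "AAAAA" + "TT" + "GGGGGG" + "C"
runs = homopolymer_runs(reference)
print(runs)
print(in_homopolymer(5, runs), in_homopolymer(9, runs))
```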
Genome Fraction Coverage
Shows the percentage of the reference genome covered at or above different coverage thresholds. For example, 80% at 25X means 80% of the genome is covered by at least 25 reads.
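The calculation behind such a curve can be sketched as follows: for each threshold, count the positions whose depth meets or exceeds it and divide by the number of positions. The function name and the toy depths are assumptions for illustration.

```python
def genome_fraction_covered(depths, thresholds):
    """Percentage of positions with depth >= each threshold."""
    n = len(depths)
    return {t: 100.0 * sum(d >= t for d in depths) / n for t in thresholds}

# Ten hypothetical per-base depths.
toy_depths = [0, 5, 10, 25, 30, 30, 40, 25, 26, 50]
print(genome_fraction_covered(toy_depths, thresholds=[1, 25]))
```

Because each threshold counts positions at or above it, the percentages can only stay level or fall as the threshold rises.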
RSeQC
Evaluates high-throughput RNA-seq data. URL: http://rseqc.sourceforge.net; DOI: 10.1093/bioinformatics/bts356
Read Distribution
Read Distribution calculates how mapped reads are distributed over genome features. In RNA-seq, typically >70% of reads map to exons, reflecting mature, properly spliced transcripts. A high proportion of intronic or intergenic reads may indicate sample quality issues, contamination, or incomplete splicing. Other sequencing strategies do not rely on these distributions for QC. For instance, WGS typically shows ~1–2% of reads in CDS exons, <1% in 5’ UTRs, ~30–40% in introns, <0.1% in TSS/TES, and ~50–60% intergenic; while WXS often has ~60–80% in CDS exons, <10% in 5’ UTRs, and generally <5% in introns, TSS/TES, or intergenic regions. Always interpret these metrics within the context of the chosen protocol and its expected outcomes.
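As a small illustration of how such percentages relate to raw counts, the sketch below converts per-feature read counts into percentages and checks the exonic share against the roughly 70% figure mentioned above for RNA-seq. The function name, feature labels, and counts are hypothetical and do not come from this report or from RSeQC output.

```python
def feature_percentages(tag_counts):
    """Convert per-feature read counts into percentages of the total."""
    total = sum(tag_counts.values())
    return {feature: 100.0 * n / total for feature, n in tag_counts.items()}

# Hypothetical counts for an RNA-seq sample.
toy_counts = {"exons": 800, "introns": 150, "intergenic": 50}
percentages = feature_percentages(toy_counts)
print(percentages)
print("exonic share above 70%:", percentages["exons"] > 70)
```

The same function applies to WGS or WXS counts, but the threshold would have to change with the protocol, as the ranges above indicate.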